
Add scipy1.4 backend #45

Merged: 7 commits merged into master from add-scipy1.4-backend on Jan 13, 2020
Conversation

oleksandr-pavlyk (Collaborator)

Closes #42

@peterbell10

oleksandr-pavlyk (Collaborator, Author)

```
$ MKL_VERBOSE=1 python
Python 3.7.5 (default, Nov 23 2019, 04:02:01)
[GCC 7.3.0] :: Intel(R) Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
Intel(R) Distribution for Python is brought to you by Intel Corporation.
Please check out: https://software.intel.com/en-us/python-distibution
>>> import scipy.fft
mkl-service + Intel(R) MKL: THREADING LAYER: (null)
mkl-service + Intel(R) MKL: setting Intel(R) MKL to use INTEL OpenMP runtime
mkl-service + Intel(R) MKL: preloading libiomp5.so runtime
MKL_VERBOSE Intel(R) MKL 2020.0 Product build 20191102 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Lnx 2.40GHz lp64 intel_thread
MKL_VERBOSE SDOT(2,0x562d021b9040,1,0x562d021b9040,1) 1.84ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:16
>>> import mkl_fft._scipy_fft_backend as mkl_be
>>> with scipy.fft.set_backend(mkl_be, only=True):
...     scipy.fft.fft([1,2,3,4])
...
MKL_VERBOSE FFT(dcfi4,bScale:0.25,tLim:1,desc:0x562d022cf680) 7.77us CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:16
array([10.+0.j, -2.+2.j, -2.+0.j, -2.-2.j])
>>> scipy.fft.fft([1,2,3,4])
array([10.-0.j, -2.+2.j, -2.-0.j, -2.-2.j])
>>> quit()
```

peterbell10 (Contributor) left a comment

```python
__all__ = ['fft', 'ifft', 'fft2', 'ifft2', 'fftn', 'ifftn',
           'rfft', 'irfft', 'rfft2', 'irfft2', 'rfftn', 'irfftn',
```
peterbell10 (Contributor)

I don't see rfft implemented anywhere. It should match the numpy output format:

```python
output = mkl_fft.rfft_numpy(x, n=n, axis=axis)
```
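For context (not part of the diff): scipy.fft.rfft follows numpy's output convention, returning the n//2 + 1 non-negative-frequency complex coefficients for a real input of length n, rather than scipy.fftpack's packed real format. A minimal numpy check:

```python
import numpy as np

# numpy's rfft of a length-n real input returns n//2 + 1 complex
# coefficients (the non-negative frequencies only).
x = np.array([1.0, 2.0, 3.0, 4.0])
out = np.fft.rfft(x)

assert out.shape == (4 // 2 + 1,)  # 3 coefficients for n == 4
assert out.dtype == np.complex128
assert np.allclose(out, [10.0, -2.0 + 2.0j, -2.0])
```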

oleksandr-pavlyk (Collaborator, Author)

Thank you, I will fix this omission.



```python
@_implements(_fft.fft)
def fft(x, n=None, axis=-1, norm=None, overwrite_x=False, workers=None):
```
peterbell10 (Contributor)

It's a shame that you ignore the workers argument everywhere. Could something be done with mkl_set_num_threads?

oleksandr-pavlyk (Collaborator, Author)

That need not be the case. I can support workers by means of mkl-service, using domain_set_num_threads (ref).

The only thing I am unsure of is how to make certain that setting and restoring the number of threads per domain is interrupt safe (so that a user does not inadvertently end up with MKL in a state where the number of threads for the FFT domain differs from its previous value).

Will the following guarantee that the number of threads is reset even in the case of a keyboard interrupt?

```python
saved = mkl.domain_get_max_threads(domain='fft')
try:
    mkl.domain_set_num_threads(workers, domain='fft')
    res = mkl_fft.fft(inputs..)
finally:
    mkl.domain_set_num_threads(saved, domain='fft')
return res
```

peterbell10 (Contributor)

I think it should be okay; Python generally raises a KeyboardInterrupt exception, so the finally clause should be run.

This SO post has some cautionary remarks though:
https://stackoverflow.com/questions/41280318/will-python-execute-finally-block-after-receiving-ctrlc

oleksandr-pavlyk (Collaborator, Author)

@peterbell10 Thank you for your feedback. I think I implemented all of it now.


```python
@_implements(_fft.fft)
def fft(x, n=None, axis=-1, norm=None, overwrite_x=False, workers=None):
    output = _pydfti.fft(x, n=n, axis=axis, overwrite_x=overwrite_x)
```
peterbell10 (Contributor)

scipy.fft upcasts np.float16 in the same way as scipy.fftpack. You might need the upcast function:

```python
x = _float_utils.__upcast_float16_array(a)
```

oleksandr-pavlyk (Collaborator, Author)

Thank you. I added the call.

@oleksandr-pavlyk oleksandr-pavlyk merged commit 357a0b7 into master Jan 13, 2020
@oleksandr-pavlyk oleksandr-pavlyk deleted the add-scipy1.4-backend branch January 13, 2020 16:58
Comment on lines +103 to +106
```python
def _workers_to_num_threads(w):
    if w is None:
        return mkl.domain_get_max_threads(domain='fft')
    return int(w)
```
peterbell10 (Contributor)

workers has a slightly more complicated meaning than literally just the number of threads. It can also be negative, e.g. -1 for all threads, -2 for all but one. Ideally, mkl_fft should respect the fft.set_workers context manager, which controls the default number of threads. You can use fft.get_workers() to access the current value.

pyfftw implements all this with a _workers_to_threads helper function:
https://github.com/pyFFTW/pyFFTW/blob/011a808c4e480f5674d79113bfb10662c103b3d9/pyfftw/interfaces/scipy_fft.py#L100-L115
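A sketch of such a helper following the scipy convention described above (workers_to_threads is a hypothetical name; a real implementation would take its default from scipy.fft.get_workers() and its maximum from either os.cpu_count() or MKL's own limit):

```python
import os

def workers_to_threads(workers, max_threads=None):
    # Map scipy.fft's `workers` argument to a concrete thread count.
    # Negative values count back from the maximum: -1 means all
    # threads, -2 means all but one, and so on.
    if max_threads is None:
        max_threads = os.cpu_count()
    if workers is None:
        return max_threads
    if workers < 0:
        workers += max_threads + 1
    if workers <= 0:
        raise ValueError('workers must not be zero or go below -max_threads')
    return workers

assert workers_to_threads(-1, max_threads=8) == 8  # all threads
assert workers_to_threads(-2, max_threads=8) == 7  # all but one
assert workers_to_threads(3, max_threads=8) == 3
```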

oleksandr-pavlyk (Collaborator, Author)

Thank you @peterbell10.

Unfortunately Intel(R) MKL's assessment of the maximal number of threads it can run disagrees with that of os.cpu_count():

```
(t_scipy-1.4.0) [11:46:11 linbox mkl_fft]$ ipython
Python 3.7.5 (default, Nov 23 2019, 04:02:01)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.11.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import os, mkl

In [2]: ( os.cpu_count(), mkl.get_max_threads() )
Out[2]: (32, 16)
```

And MKL clips large values to the maximum number of threads possible:

```
In [3]: mkl.set_num_threads(30)
Out[3]: 16
```

Moreover, mkl.get_max_threads() can be controlled via an environment variable:

```
(t_scipy-1.4.0) [11:49:08 linbox mkl_fft]$ ipython -c "import mkl; print(mkl.get_max_threads())"
16
(t_scipy-1.4.0) [11:49:12 linbox mkl_fft]$ MKL_NUM_THREADS=8 ipython -c "import mkl; print(mkl.get_max_threads())"
8
```

So I will be using mkl.get_max_threads() to obtain the maximum value relative to which to interpret the negative workers value.

oleksandr-pavlyk (Collaborator, Author)

Added in #47

peterbell10 (Contributor)

> I will be using mkl.get_max_threads() to obtain the maximum value relative to which to interpret the negative workers value.

That sounds a bit problematic to me, since a library using the scipy.fft interface can't really be sure whether a negative workers value is valid. Perhaps it would be better to follow os.cpu_count(), let mkl.set_num_threads clip the value, and just give a warning?

cc @larsoner, @rgommers any thoughts on this?
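A sketch of that alternative (threads_with_clipping is a hypothetical name): interpret workers relative to os.cpu_count(), then let the MKL-reported maximum clip the result with a warning:

```python
import os
import warnings

def threads_with_clipping(workers, mkl_max):
    # Interpret `workers` relative to os.cpu_count() (negative values
    # count back from the total), then clip to MKL's reported maximum
    # (mkl.get_max_threads() in practice) and warn if clipping occurs.
    cpus = os.cpu_count()
    n = workers + cpus + 1 if workers < 0 else workers
    if n > mkl_max:
        warnings.warn('MKL supports at most %d threads; clipping %d' % (mkl_max, n))
        n = mkl_max
    return n

assert threads_with_clipping(8, mkl_max=16) == 8
assert threads_with_clipping(30, mkl_max=16) == 16  # clipped, with a warning
```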

oleksandr-pavlyk (Collaborator, Author)

I made that change in #47

Successfully merging this pull request may close these issues.

Provide an interface for scipy.fft
2 participants